Abstract: Shearer and McEliece [8] showed that there is no
MacWilliams identity for the free distance spectra of orthogonal
linear convolutional codes. We show that, on the other hand,
there does exist a MacWilliams identity between the generating
functions of the weight distributions per unit time of a linear
convolutional code C and its orthogonal code $C^\perp$, and that
this distribution is as useful as the free distance spectrum for
estimating code performance. These observations are similar to
those made recently by Bocharova et al. [1]; however, we focus
on terminating by tail-biting rather than by truncation.

I. Introduction 

Finding a MacWilliams-type identity for convolutional
codes is a problem of long standing [8]. For a linear
time-invariant convolutional code C over a finite field, the most
commonly studied distance distribution is the free (Hamming)
distance spectrum, namely, the distribution of (Hamming)
weights of codewords in C that start and end in the zero state
without passing through an intermediate zero state. Shearer
and McEliece [8] showed by example that the free distance
spectrum of C does not in general determine that of $C^\perp$, and
therefore that there could be no MacWilliams identity for such
distributions.

Gluesing-Luerssen and Schneider (GLS) have recently
formulated [6] and proved [7] a MacWilliams-type identity for
convolutional codes involving the Hamming weight adjacency
matrix (HWAM) of a convolutional code C and the HWAM of
its orthogonal code $C^\perp$. In [3], [4], the GLS result was proved
in a different way, and generalized to various kinds of weight
adjacency matrices and to group codes defined on graphs.

More recently, Bocharova, Hug, Johannesson and
Kudryashov [1] have proved a different MacWilliams-type
identity for truncations of a convolutional code C and
its orthogonal code $C^\perp$, and have shown that by letting
the truncation length become large, an approximation to
the free distance spectrum can be obtained. In this paper,
which is mostly based on [4], we derive similar results for
weight distributions of block codes obtained by various kinds
of termination procedures, of which we regard tail-biting
as the nicest. We argue that these alternative distributions
are just as useful for estimating code performance as the
free distance spectrum. These results effectively answer the
original question posed by Shearer and McEliece [8].



II. Terminated convolutional codes 

A general method for approximating the free distance
spectrum of a linear convolutional code C is to derive a
series of block codes $C_N$ of length N from C by some
sort of termination procedure, and then to study the distance
distributions of $C_N$ as $N \to \infty$. As we shall see in Section
III, for any of the termination methods below, the distance
distribution of $C_N$, normalized by N, approaches the free
distance spectrum of C for $d_{\mathrm{free}} \le d < 2d_{\mathrm{free}}$. However,
the usual termination methods are problematic if we are
also interested in the distance distribution of the orthogonal
convolutional code $C^\perp$.

For example, the most common termination method is to
take the subcode $C_{[0,N)}$ of C, consisting of all sequences in
C whose support is contained in the interval
$[0, N) = \{k \in \mathbb{Z} \mid 0 \le k < N\}$ (i.e., all code sequences that pass
through the zero state at times 0 and N, restricted to [0, N)), which
is effectively a block code of length N time units. As a
subcode of C, $C_{[0,N)}$ has at least the minimum (free) distance
of C. However, the orthogonal code to $C_{[0,N)}$ is the projection
$(C^\perp)_{|[0,N)}$ of the orthogonal convolutional code $C^\perp$ onto the
interval [0, N) (i.e., all orthogonal code sequences that pass
through any state at times 0 and N, restricted to [0, N)). In
general, a projection $(C^\perp)_{|[0,N)}$ has low-weight codewords, no
matter how large N becomes.

Bocharova et al. [1] have considered another kind of
terminated code that they call a truncated code, which we will
denote by $C_{\triangleleft[0,N)}$. Such a code may be described as the
subcode $C_{[0,\infty)}$ projected onto [0, N): i.e.,

$$C_{\triangleleft[0,N)} = (C_{[0,\infty)})_{|[0,N)}.$$

By projection/subcode duality, the dual code $(C_{\triangleleft[0,N)})^\perp$ is

$$((C_{[0,\infty)})_{|[0,N)})^\perp = ((C_{[0,\infty)})^\perp)_{[0,N)} = ((C^\perp)_{|[0,\infty)})_{[0,N)};$$

i.e., the subcode defined on [0, N) of the projection of the
orthogonal convolutional code $C^\perp$ onto $[0, \infty)$. Alternatively,
$(C_{\triangleleft[0,N)})^\perp$ may be defined as a reverse-truncated code [1]

$$(C^\perp)_{\triangleright[0,N)} = ((C^\perp)_{(-\infty,N)})_{|[0,N)}.$$

Thus there is a MacWilliams identity between the weight
distributions of $C_{\triangleleft[0,N)}$ and $(C^\perp)_{\triangleright[0,N)}$. Moreover, unlike the
subcode $C_{[0,N)}$ or the projection $C_{|[0,N)}$, a truncated code
has the same rate as C. However, truncated codes have small
minimum distance.

A more elegant method of terminating a convolutional code
C is via tail-biting. The tail-biting terminated code $C_{\|[0,N)}$ of
the convolutional code C on the interval [0, N) is the set of
all codewords in C that pass through the same state at times
0 and N, restricted to [0, N). For large enough N, the
tail-biting code $C_{\|[0,N)}$ has the same minimum distance as C;
moreover, $C_{\|[0,N)}$ has the same rate as C. Most importantly,
the orthogonal code to $C_{\|[0,N)}$ is $(C^\perp)_{\|[0,N)}$, the tail-biting
terminated code of $C^\perp$ on the same interval [2]. Thus there
is a MacWilliams identity between the weight distributions
of $C_{\|[0,N)}$ and $(C^\perp)_{\|[0,N)}$. Finally, we will see that these
distributions approach the free distance spectra of C and
$C^\perp$ nicely as $N \to \infty$.

Example 1 (rate-1/2 4-state binary linear convolutional code).
Consider the rate-1/2 binary linear time-invariant convolutional
code C with degree-2 generators $(1 + D^2, 1 + D + D^2)$, in
standard D-transform notation. A minimal encoder for this
code is the linear time-invariant system with impulse response
(11, 01, 11, 00, ...), which has the 4-state trellis section shown
in Figure 1(a). The orthogonal convolutional code $C^\perp$ is
the rate-1/2 binary linear convolutional code with generators
$(1 + D + D^2, 1 + D^2)$, which has a minimal 4-state linear
encoder with impulse response (11, 10, 11, 00, ...), and the
4-state trellis section shown in Figure 1(b).




Fig. 1. Trellis sections of (a) rate-1/2 4-state binary convolutional code C;
(b) orthogonal code $C^\perp$.

We now consider various methods of terminating this
convolutional code C with a block length of N = 4. The subcode
$C_{[0,4)}$ is the (8, 2) binary linear block code generated by the
two generators

11 01 11 00
00 11 01 11

The minimum distance of this block code is the same as that
of C, namely 5, although its rate is lower. The orthogonal
code to the subcode $C_{[0,4)}$ is the projection $(C^\perp)_{|[0,4)}$ of the
orthogonal convolutional code $C^\perp$, which is the (8, 6) binary
linear block code generated by the six generators

11 00 00 00
10 11 00 00
11 10 11 00
00 11 10 11
00 00 11 10
00 00 00 11

The minimum distance of this block code is 2, less than that
of $C^\perp$, although its rate is higher.

The truncated code $C_{\triangleleft[0,4)}$ is the (8, 4) binary linear block
code generated by

11 01 11 00
00 11 01 11
00 00 11 01
00 00 00 11

The minimum distance of this block code is 2, but its rate is
the same as that of C. Its orthogonal code $(C^\perp)_{\triangleright[0,4)}$ is the
(8, 4) binary linear block code generated by

11 00 00 00
10 11 00 00
11 10 11 00
00 11 10 11

which has the same parameters.

The tail-biting terminated code $C_{\|[0,4)}$ is the (8, 4) binary
linear block code generated by

11 01 11 00
00 11 01 11
11 00 11 01
01 11 00 11

whereas the orthogonal tail-biting terminated code $(C^\perp)_{\|[0,4)}$
is the (8, 4) binary linear block code generated by the four
generators

11 10 11 00
00 11 10 11
11 00 11 10
10 11 00 11

Both of these codes have a minimum distance of only 2 (e.g.,
for paths such as 01 00 01 00 from state 10 to state 10).
However, for N > 10, it turns out that the minimum distance
of both tail-biting terminated codes is 5, the same as the
minimum distance of C or $C^\perp$. □
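The block codes of Example 1 can be reproduced mechanically. The following is a minimal sketch in Python, assuming only the two impulse responses given above; the helper names (`shifts`, `codewords`, `min_distance`, `orthogonal`) are illustrative and do not come from the paper. Each generator matrix is built by placing the impulse response at a range of start times and either clipping it to [0, N) or wrapping it around modulo N.

```python
from itertools import product

def shifts(impulse, N, starts, wrap=False):
    """Generator rows: the impulse response placed at each start time,
    clipped to [0, N), or wrapped around modulo N for tail-biting."""
    n = len(impulse[0])
    rows = []
    for t0 in starts:
        row = [0] * (n * N)
        for j, block in enumerate(impulse):
            t = (t0 + j) % N if wrap else t0 + j
            if 0 <= t < N:
                for i, bit in enumerate(block):
                    row[n * t + i] ^= bit
        rows.append(row)
    return rows

def codewords(rows):
    """All GF(2) linear combinations of the rows, as a set of tuples."""
    words = set()
    for coeffs in product((0, 1), repeat=len(rows)):
        word = [0] * len(rows[0])
        for c, row in zip(coeffs, rows):
            if c:
                word = [a ^ b for a, b in zip(word, row)]
        words.add(tuple(word))
    return words

def min_distance(rows):
    """Minimum Hamming weight of a nonzero codeword."""
    return min(sum(w) for w in codewords(rows) if any(w))

def orthogonal(rows_a, rows_b):
    """True if every row of rows_a is orthogonal (mod 2) to every row of rows_b."""
    return all(sum(a & b for a, b in zip(r, s)) % 2 == 0
               for r in rows_a for s in rows_b)

g = [(1, 1), (0, 1), (1, 1)]   # impulse response of C
h = [(1, 1), (1, 0), (1, 1)]   # impulse response of the orthogonal code
N, m = 4, 2                    # block length and encoder memory

codes = {
    "subcode":           shifts(g, N, range(0, N - m)),
    "dual projection":   shifts(h, N, range(-m, N)),
    "truncated":         shifts(g, N, range(0, N)),
    "reverse-truncated": shifts(h, N, range(-m, N - m)),
    "tail-biting":       shifts(g, N, range(N), wrap=True),
    "dual tail-biting":  shifts(h, N, range(N), wrap=True),
}

# Print the number of codewords and the minimum distance of each code.
for name, rows in codes.items():
    print(name, len(codewords(rows)), min_distance(rows))
```

Calling `orthogonal(codes["tail-biting"], codes["dual tail-biting"])`, and likewise for the other two pairs, checks the dualities stated in the text for N = 4. The exhaustive enumeration in `codewords` is only practical for small block lengths.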

III. Free distance spectra for convolutional codes from terminated codes

Let us now consider how the free distance spectrum of a
linear time-invariant convolutional code C may be derived from
the weight distribution of a terminated code of length N as
$N \to \infty$.

Without loss of generality, we may assume that C is 
generated by a minimal encoder, which is necessarily non- 
catastrophic: i.e., the unique state sequence associated with 
the all-zero code sequence is the all-zero state sequence. 
Consequently, the lowest-weight words of a terminated code 
as $N \to \infty$ must be those that pass through the zero state
almost all of the time. These code sequences are as follows, 
for the various termination methods we have considered: 
• If we terminate to the subcode $C_{[0,N)}$, then code
sequences start and end in the zero state, and the
lowest-weight sequences correspond to the lowest-weight
sequences in the free distance spectrum. If the minimum
free distance is $d_{\mathrm{free}}$, then for $d_{\mathrm{free}} \le d < 2d_{\mathrm{free}}$
there will be approximately $N \times N_d$ sequences in the
terminated code of weight d, where $N_d$ is the number
of code sequences of weight d in the free distance
spectrum of C. Thus, for $d_{\mathrm{free}} \le d < 2d_{\mathrm{free}}$, the weight
distribution per unit time of C is the limit of the weight
distribution of $C_{[0,N)}$ normalized by (divided by) N as
$N \to \infty$. For $d \ge 2d_{\mathrm{free}}$, there will be overcounting
(e.g., two sequences of weight $d_{\mathrm{free}}$ may be counted as
one of weight $2d_{\mathrm{free}}$), but we will argue below that
such overcounting should not affect estimates of code
performance.

• If we terminate to the projection $C_{|[0,N)}$, then code
sequences can start and end in any state, and there will
be low-weight sequences starting with a low-weight state
transition $s \to 0$, remaining in state 0 for nearly N
time units, and then ending with a low-weight transition
$0 \to s'$, where s and s' are not both 0. Thus the minimum
distance of $C_{|[0,N)}$ will be less than $d_{\mathrm{free}}$ for all N.
However, the number of such low-weight sequences remains
constant, so after normalization we will eventually see
the same normalized weight distribution as for $C_{[0,N)}$.

• If we terminate to the truncated code $C_{\triangleleft[0,N)}$, then by
the same argument we will eventually see the same
normalized weight distribution. In this case the total
weight of a code sequence starting in the zero state,
remaining there for nearly N time units, and then ending
with a low-weight transition $0 \to s$, is only that of
the low-weight transition $0 \to s$. However, again the
number of such low-weight sequences remains constant,
so after normalization we will eventually see the correct
normalized weight distribution.

• If we terminate to the tail-biting code $C_{\|[0,N)}$, then by
the same argument we will eventually see the same
normalized weight distribution. Note however that in this
case the total weight of a code sequence starting with
a low-weight transition $s \to 0$, remaining in the zero
state for nearly N time units, and then ending with a
low-weight transition $0 \to s$, must be at least $d_{\mathrm{free}}$, since
the ending sequence (corresponding to the state transition
$0 \to s$) followed by the starting sequence (corresponding
to $s \to 0$) must be a code sequence. Thus the minimum
distance of $C_{\|[0,N)}$ must equal $d_{\mathrm{free}}$ for large enough N.

We conclude that as $N \to \infty$ the normalized weight
distribution of any of these terminated codes approaches the
free distance spectrum of C for $d_{\mathrm{free}} \le d < 2d_{\mathrm{free}}$. However,
only the tail-biting termination has the same rate as C and, for
N large enough, the same minimum distance $d_{\mathrm{free}}$.

Finally, we argue that the normalized weight distribution of
any of these terminated codes $C_N$ must yield the same estimate
of code performance over N time units as the free distance
spectrum of C, if these estimates are accurate. The probability
$P(\mathcal{E})$ of an error event of C per unit time may be estimated using
the free distance spectrum. The probability of any error in
N time units is then estimated as $N P(\mathcal{E})$. If this is a good
estimate (implying $N \ll 1/P(\mathcal{E})$), then the probability of two
or more error events in N time units must be negligible. But
the probability of any error in decoding C over N time units
is essentially the same as the probability of block decoding
error in decoding $C_N$, which may be estimated by the weight
distribution of $C_N$, which counts codewords that include two
or more error events. If the probability of two or more error
events in N time units is negligible, then an estimate based on
the weight distribution of $C_N$ must approximately agree with
an estimate based on the free distance spectrum of C.
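
As a rough sketch of this argument in symbols, one may write both estimates in union-bound form. The pairwise error probability $P_d$ between two code sequences at Hamming distance d, the symbol $A_d(C_N)$ for the number of codewords of weight d in $C_N$, and the union-bound form itself are assumptions introduced here for illustration; the text does not fix a channel or a bound.

$$P(\mathcal{E}) \approx \sum_{d \ge d_{\mathrm{free}}} N_d P_d, \qquad \Pr\{\text{block error in } C_N\} \approx \sum_{d > 0} A_d(C_N) P_d .$$

For $d_{\mathrm{free}} \le d < 2d_{\mathrm{free}}$ we have $A_d(C_N) \approx N N_d$ for large N, so these terms of the second sum reproduce $N$ times the corresponding terms of the first. For $d \ge 2d_{\mathrm{free}}$, $A_d(C_N)$ additionally counts codewords made up of two or more error events; under these assumptions their contribution is roughly of order $(N P(\mathcal{E}))^2$, which is negligible next to $N P(\mathcal{E})$ when $N \ll 1/P(\mathcal{E})$.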

IV. Weight adjacency matrices and weight generating functions

We now show how Hamming weight generating functions 
for terminations of a linear time-invariant convolutional code C 
may be derived from the Hamming weight adjacency matrix of 
a minimal linear time-invariant encoder for C. This will allow 
us to state MacWilliams identities for terminated convolutional
codes, and to estimate code performance. 

Given a linear time-invariant encoder for a convolutional
code C with state space S and symbol alphabet $\mathcal{A}$, the
Hamming weight adjacency matrix (HWAM) is the matrix
A(x) indexed by $S \times S$ whose elements are

$$A_{s,s'}(x) = \sum_{a \in T(s,s')} x^{w(a)},$$

where x is an indeterminate, T(s, s') is the subset of
symbols $a \in \mathcal{A}$ such that the state/symbol transition ("branch")
$(s, a, s') \in S \times \mathcal{A} \times S$ actually occurs in the encoder, and
w(a) is the Hamming weight of the symbol $a \in \mathcal{A}$.
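
The definition above translates directly into a small routine. The following is a minimal sketch in Python for the special case of a rate-1/n binary feedforward encoder specified by its impulse response; the function names `hwam` and `show`, the state convention (most recent input bit first), and the representation of each matrix entry as a `{weight: count}` dictionary are all assumptions made here for illustration.

```python
def hwam(impulse, states):
    """HWAM of a rate-1/n binary feedforward encoder.

    impulse: output blocks g_0, ..., g_m of the impulse response.
    states:  ordered list of state tuples (u_{k-1}, ..., u_{k-m}).
    Entry [s][s'] is a dict {weight: number of branches of that weight}.
    """
    m = len(impulse) - 1
    index = {s: i for i, s in enumerate(states)}
    A = [[{} for _ in states] for _ in states]
    for s in states:
        for u in (0, 1):
            inputs = (u,) + s                  # (u_k, u_{k-1}, ..., u_{k-m})
            out = [0] * len(impulse[0])
            for bit, block in zip(inputs, impulse):
                if bit:
                    out = [a ^ b for a, b in zip(out, block)]
            nxt = inputs[:m]                   # next state (u_k, ..., u_{k-m+1})
            entry = A[index[s]][index[nxt]]
            w = sum(out)                       # Hamming weight of the branch label
            entry[w] = entry.get(w, 0) + 1
    return A

def show(poly):
    """Format a {weight: count} dict as a polynomial in x."""
    if not poly:
        return "0"
    terms = []
    for w in sorted(poly):
        c = poly[w]
        if w == 0:
            terms.append(str(c))
        else:
            terms.append(("" if c == 1 else str(c)) + "x^" + str(w))
    return " + ".join(terms)

# Encoder of Example 1, impulse response (11, 01, 11), states ordered 00, 10, 01, 11.
states = [(0, 0), (1, 0), (0, 1), (1, 1)]
A = hwam([(1, 1), (0, 1), (1, 1)], states)
for s, row in zip(states, A):
    print(s, [show(p) for p in row])
```

The same routine applies to any rate-1/n feedforward encoder once a state ordering is chosen; a different ordering only permutes the rows and columns of the matrix.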

Example 1 (cont.). For the rate-1/2 binary convolutional code
C of Example 1, the HWAM of the encoder of Figure 1(a) is

$$A(x) = \begin{array}{c|cccc}
s \backslash s' & 00 & 10 & 01 & 11 \\ \hline
00 & 1 & x^2 & 0 & 0 \\
10 & 0 & 0 & x & x \\
01 & x^2 & 1 & 0 & 0 \\
11 & 0 & 0 & x & x
\end{array}$$

For the orthogonal code $C^\perp$, the HWAM of the encoder of
Figure 1(b) is

$$A^\perp(x) = \begin{array}{c|cccc}
s \backslash s' & 00 & 10 & 01 & 11 \\ \hline
00 & 1 & 0 & x^2 & 0 \\
10 & x^2 & 0 & 1 & 0 \\
01 & 0 & x & 0 & x \\
11 & 0 & x & 0 & x
\end{array}$$

which in this case is the transpose of A(x). □

It is shown in [3], [4], [6], [7] how the HWAM A(x) of
a minimal encoder for C determines the HWAM $A^\perp(x)$ of a
minimal encoder for $C^\perp$ and vice versa via a MacWilliams-type
identity, but we will not need that result here.

Now it is easy to see that if we take N consecutive trellis
sections of a minimal encoder for C as a single section, then
the HWAM of this length-N trellis section is simply the Nth
power $A^N(x)$ of the basic HWAM A(x).



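$$A^4(x) = \begin{pmatrix}
1 + 2x^5 + x^6 & x^2 + x^3 + x^4 + x^7 & x^3 + 2x^4 + x^5 & x^3 + 2x^4 + x^5 \\
x^3 + 2x^4 + x^5 & x^2 + x^3 + x^5 + x^6 & 2x^3 + x^4 + x^6 & 2x^3 + x^4 + x^6 \\
x^2 + x^3 + x^4 + x^7 & x^2 + x^4 + 2x^5 & x^2 + x^3 + x^5 + x^6 & x^2 + x^3 + x^5 + x^6 \\
x^3 + 2x^4 + x^5 & x^2 + x^3 + x^5 + x^6 & 2x^3 + x^4 + x^6 & 2x^3 + x^4 + x^6
\end{pmatrix}$$

Fig. 2. HWAM $A^4(x)$ of a section of N = 4 time units of Example 1 code C (states ordered 00, 10, 01, 11).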
Example 1 (cont.). Given the HWAM A(x) above for a
minimal encoder of our example code C, the HWAM of a
section consisting of N = 2 time units of this code is (with
states ordered 00, 10, 01, 11)

$$A^2(x) = \begin{pmatrix}
1 & x^2 & x^3 & x^3 \\
x^3 & x & x^2 & x^2 \\
x^2 & x^4 & x & x \\
x^3 & x & x^2 & x^2
\end{pmatrix}.$$

This shows that there is exactly one path from each state at
time k to each state at time k + 2, and that the minimum
Hamming weight of any of these paths (other than the zero
path) is 1.

For a section consisting of N = 4 time units of this code,
the HWAM $A^4(x)$ is given in Fig. 2. This HWAM shows that
there are four paths from each state at time k to each state at
time k + 4, and that the minimum nonzero Hamming weight
of any of these paths is 2. □
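
These matrix powers are easy to check numerically. The following is a minimal sketch in Python, assuming the HWAM of the Example 1 encoder with states ordered 00, 10, 01, 11 and polynomials stored as `{exponent: coefficient}` dictionaries; the helper names `poly_mul`, `mat_mul`, and `mat_pow` are illustrative.

```python
def poly_mul(p, q):
    """Product of two polynomials stored as {exponent: coefficient}."""
    r = {}
    for i, a in p.items():
        for j, b in q.items():
            r[i + j] = r.get(i + j, 0) + a * b
    return r

def mat_mul(A, B):
    """Product of two square matrices with polynomial entries."""
    n = len(A)
    C = [[{} for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for w, c in poly_mul(A[i][k], B[k][j]).items():
                    C[i][j][w] = C[i][j].get(w, 0) + c
    return C

def mat_pow(A, N):
    """N-th power of A, starting from the identity matrix."""
    n = len(A)
    P = [[{0: 1} if i == j else {} for j in range(n)] for i in range(n)]
    for _ in range(N):
        P = mat_mul(P, A)
    return P

one, x, x2 = {0: 1}, {1: 1}, {2: 1}
A = [[one, x2, {}, {}],      # from state 00
     [{}, {}, x, x],         # from state 10
     [x2, one, {}, {}],      # from state 01
     [{}, {}, x, x]]         # from state 11

for N in (2, 4):
    P = mat_pow(A, N)
    # Number of paths per entry = sum of the coefficients of that entry.
    paths = {sum(p.values()) for row in P for p in row}
    min_w = min(w for row in P for p in row for w in p if w > 0)
    print(N, paths, min_w)
```

Summing the coefficients of an entry counts the paths between the corresponding pair of states, which is how the "one path" and "four paths" statements above can be read off.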

The weight generating functions of various terminated codes
of C can now be read from these weight adjacency matrices.
Since the subcode $C_{[0,N)}$ is the set of all sequences in C that
pass through the zero state at times 0 and N, its weight
generating function is simply the (0, 0) element of $A^N(x)$.
Similarly, since the projection $C_{|[0,N)}$ is the set of all sequences
in C that pass through any states at times 0 and N, its weight
generating function is the sum of all elements of $A^N(x)$.

Since the truncated code $C_{\triangleleft[0,N)}$ is the set of all sequences
in C that pass through the zero state at time 0 and any state
at time N, its weight generating function is the sum of all
elements in the first row of $A^N(x)$. Similarly, the weight
generating function of $C_{\triangleright[0,N)}$ is the sum of all elements in
the first column of $A^N(x)$.

Finally, since the tail-biting termination $C_{\|[0,N)}$ is the set
of all sequences in C that pass through the same state at
times 0 and N, its weight generating function is the sum of
all diagonal elements of $A^N(x)$; i.e., its trace $\mathrm{Tr}(A^N(x))$.
Since $C_{\|[0,N)}$ and $(C^\perp)_{\|[0,N)}$ are orthogonal block codes, there
is a MacWilliams identity between their weight generating
functions.
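
The four read-off rules above can be sketched as follows. This is an illustrative Python fragment, not the paper's code: it assumes the Example 1 HWAM with the zero state at index 0, stores polynomials as `{exponent: coefficient}` dictionaries, and uses the hypothetical helper names `mat_pow`, `poly_sum`, and `terminations`.

```python
def mat_pow(A, N):
    """N-th power of a square matrix whose entries are {exponent: coeff} dicts."""
    n = len(A)
    P = [[{0: 1} if i == j else {} for j in range(n)] for i in range(n)]
    for _ in range(N):
        Q = [[{} for _ in range(n)] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    for wa, ca in P[i][k].items():
                        for wb, cb in A[k][j].items():
                            Q[i][j][wa + wb] = Q[i][j].get(wa + wb, 0) + ca * cb
        P = Q
    return P

def poly_sum(polys):
    """Sum of polynomials stored as {exponent: coefficient}."""
    total = {}
    for p in polys:
        for w, c in p.items():
            total[w] = total.get(w, 0) + c
    return total

def terminations(A, N):
    """Weight generating functions of the terminated codes, read from A^N.
    Index 0 is assumed to be the zero state."""
    P = mat_pow(A, N)
    n = len(P)
    return {
        "subcode":           P[0][0],                                # (0,0) element
        "projection":        poly_sum(p for row in P for p in row),  # all elements
        "truncated":         poly_sum(P[0]),                         # first row
        "reverse-truncated": poly_sum(row[0] for row in P),          # first column
        "tail-biting":       poly_sum(P[i][i] for i in range(n)),    # trace
    }

one, x, x2 = {0: 1}, {1: 1}, {2: 1}
A = [[one, x2, {}, {}],
     [{}, {}, x, x],
     [x2, one, {}, {}],
     [{}, {}, x, x]]

for name, wgf in terminations(A, 4).items():
    print(name, dict(sorted(wgf.items())))
```

For N = 4 the coefficient sums are 4, 64, 16, 16 and 16, matching the dimensions of the block codes listed in Example 1 (the reverse-truncated entry here is that of C itself, not of the orthogonal code).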

Example 1 (cont.). For the rate-1/2 binary convolutional code
of Example 1, the Hamming weight generating function of
the tail-biting termination $C_{\|[0,4)}$ of length 4 is the trace
of $A^4(x)$, namely $1 + 2x^2 + 4x^3 + x^4 + 4x^5 + 4x^6$. Since
$(A^\perp)^4(x)$ is the transpose of $A^4(x)$, the orthogonal tail-biting
terminated code $(C^\perp)_{\|[0,4)}$ is an equivalent code with the same
Hamming weight generating function. It is easy to check that
this Hamming weight generating function is indeed invariant
under the MacWilliams transform. □
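
The invariance claimed above can be checked with a few lines of code. The following is a minimal sketch in Python of the standard MacWilliams transform for binary linear codes, $W^\perp(x) = |C|^{-1} \sum_w A_w (1 - x)^w (1 + x)^{n-w}$; the function names are illustrative, and weight distributions are stored as `{weight: count}` dictionaries.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def poly_pow(p, e):
    """e-th power of a polynomial given as a coefficient list."""
    r = [1]
    for _ in range(e):
        r = poly_mul(r, p)
    return r

def macwilliams(W, n):
    """Weight distribution of the dual of a binary linear code of length n
    whose weight distribution is W = {weight: count}."""
    size = sum(W.values())                  # number of codewords, 2^k
    total = [0] * (n + 1)
    for w, a in W.items():
        term = poly_mul(poly_pow([1, -1], w), poly_pow([1, 1], n - w))
        for i, c in enumerate(term):
            total[i] += a * c
    # For a genuine linear code every coefficient is divisible by |C|.
    assert all(c % size == 0 for c in total)
    return {i: c // size for i, c in enumerate(total) if c}

# Tail-biting code of Example 1 with N = 4: an (8, 4) code.
W = {0: 1, 2: 2, 3: 4, 4: 1, 5: 4, 6: 4}
print(macwilliams(W, 8) == W)
```

The same function can be applied to the other pairs in Example 1, for instance to obtain the weight distribution of the (8, 6) projection from that of the (8, 2) subcode.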



Using tail-biting terminated codes, and normalizing the
weight distribution by dividing by N, we have that the
generating function of the normalized Hamming weight distribution
of C is

$$g_C(x) = \lim_{N \to \infty} \frac{1}{N} \mathrm{Tr}(A^N(x)).$$

Moreover, there is a MacWilliams identity between $g_C(x)$ and
$g_{C^\perp}(x)$. The performance of C is determined by $g_C(x)$, and
that of $C^\perp$ by $g_{C^\perp}(x)$. (Similar observations are made in [1],
using truncated codes.)

Example 1 (cont.). For a section consisting of N = 16 time
units of the rate-1/2 binary convolutional code C of Example
1, the HWAM $A^{16}(x)$ (modulo $x^8$) is given in Fig. 3.
Notice that

$$\mathrm{Tr}(A^{16}(x)) = 1 + 16x^5 + 32x^6 + 64x^7 + \cdots,$$

so that normalizing the distribution by dividing by N = 16
already gives the precise free distance spectrum of C for d < 8,
namely $x^5 + 2x^6 + 4x^7 + \cdots$. Thus the convergence to the
limiting generating function $g_C(x)$ is rapid and exact. This
property of tail-biting terminations is not shared by other kinds
of terminations. □
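
The N = 16 computation can be reproduced with truncated polynomial arithmetic. The following is a minimal sketch in Python, again assuming the Example 1 HWAM with states ordered 00, 10, 01, 11; the helper names and the `cap` parameter (all terms of degree `cap` or higher are discarded) are illustrative choices. It also prints the normalized (0, 0) entry, i.e. the subcode termination, for comparison.

```python
from fractions import Fraction

def mat_mul(A, B, cap):
    """Product of polynomial matrices, keeping only terms of degree < cap."""
    n = len(A)
    C = [[{} for _ in range(n)] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                for wa, ca in A[i][k].items():
                    for wb, cb in B[k][j].items():
                        if wa + wb < cap:
                            C[i][j][wa + wb] = C[i][j].get(wa + wb, 0) + ca * cb
    return C

def mat_pow(A, N, cap):
    """N-th power of A modulo x^cap."""
    n = len(A)
    P = [[{0: 1} if i == j else {} for j in range(n)] for i in range(n)]
    for _ in range(N):
        P = mat_mul(P, A, cap)
    return P

one, x, x2 = {0: 1}, {1: 1}, {2: 1}
A = [[one, x2, {}, {}],
     [{}, {}, x, x],
     [x2, one, {}, {}],
     [{}, {}, x, x]]

N, cap = 16, 8
P = mat_pow(A, N, cap)

# Trace = weight generating function of the tail-biting code (mod x^cap).
tr = {}
for i in range(len(P)):
    for w, c in P[i][i].items():
        tr[w] = tr.get(w, 0) + c

# Normalize by N, dropping the all-zero codeword.
tail_biting = {w: Fraction(c, N) for w, c in tr.items() if w > 0}
subcode = {w: Fraction(c, N) for w, c in P[0][0].items() if w > 0}
print("tail-biting:", tail_biting)
print("subcode:    ", subcode)
```

Truncating after every multiplication is valid because all exponents are nonnegative, so discarded terms can never contribute to lower degrees. The subcode counts fall short of multiples of N because an error event must fit entirely inside [0, N), which illustrates why that termination converges more slowly.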

It appears that the behavior of $g_C(x)$ might be analyzed by
using an extension of Perron-Frobenius theory to generating
function matrices, as in [5]; however, we have not attempted
such an analysis.

Example 2 {cf. HI, HI). The two codes proposed by Shearer 
and McEliece [8] for their counterexample make an excellent
example. The first code is a rate-1/3 binary linear
time-invariant convolutional code $C_1$ generated by the degree-1
generators (1, 1 + D, D), i.e., $C_1$ is generated by a minimal
encoder with impulse response (110, 011, 000, ...), whose
trellis section is shown in Figure 4(a). The HWAM of this
encoder is

$$A_1(x) = \begin{pmatrix} 1 & x^2 \\ x^2 & x^2 \end{pmatrix}.$$

Fig. 4. Trellis sections of (a) rate-1/3 2-state binary convolutional code $C_1$;
(b) similar code $C_2$.

The second code is a rate-1/3 binary linear time-invariant
convolutional code $C_2$ generated by the degree-1 generators
(D, D, 1 + D), i.e., $C_2$ is generated by a minimal encoder with
impulse response (001, 111, 000, ...), whose trellis section is
shown in Figure 4(b). The HWAM of this encoder is

$$A_2(x) = \begin{pmatrix} 1 & x \\ x^3 & x^2 \end{pmatrix}.$$

Fig. 3. HWAM $A^{16}(x)$ (modulo $x^8$) of a section of N = 16 time units of Example 1 code C. [4 × 4 polynomial matrix; entries not recoverable from the source.]


Since the weights of the $0 \to 0$ and $1 \to 1$ transitions are the
same for $C_1$ and $C_2$, and since the sums of the weights of the
$0 \to 1$ and $1 \to 0$ transitions are the same, it is evident that the
weight distributions of the subcodes $(C_1)_{[0,N)}$ and $(C_2)_{[0,N)}$
are the same for all N, and that the free distance spectra of
$C_1$ and $C_2$ are also the same. For the same reason, the weight
distributions of the tail-biting terminated codes $(C_1)_{\|[0,N)}$ and
$(C_2)_{\|[0,N)}$ are the same for all N.

However, the weight distributions of the projections
$(C_1)_{|[0,N)}$ and $(C_2)_{|[0,N)}$ are not the same even for N = 1. It
follows that the weight distributions of the subcodes $(C_1^\perp)_{[0,N)}$
and $(C_2^\perp)_{[0,N)}$ of their orthogonal codes $C_1^\perp$ and $C_2^\perp$ are not
the same, and therefore that their free distance spectra are not
the same; this was the point of Shearer and McEliece [8].

On the other hand, since the weight distributions of the
tail-biting terminated codes $(C_1)_{\|[0,N)}$ and $(C_2)_{\|[0,N)}$ are the same
for all N, it follows that the weight distributions of the
tail-biting terminated codes $(C_1^\perp)_{\|[0,N)}$ and $(C_2^\perp)_{\|[0,N)}$ are the
same for all N.

Since the performance of $C_1^\perp$ and $C_2^\perp$ can be analyzed from
these weight distributions, it follows that the performance of
$C_1^\perp$ and $C_2^\perp$ is effectively the same, despite the difference in
their free distance spectra.¹ □
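
The comparisons in Example 2 can be checked numerically. The following is a minimal sketch in Python, assuming the two 2-state HWAMs that follow from the impulse responses (110, 011) and (001, 111), with polynomials stored as `{exponent: coefficient}` dictionaries; the helper names `mat_pow`, `trace`, and `total` are illustrative.

```python
def mat_pow(A, N):
    """N-th power of a square matrix whose entries are {exponent: coeff} dicts."""
    n = len(A)
    P = [[{0: 1} if i == j else {} for j in range(n)] for i in range(n)]
    for _ in range(N):
        Q = [[{} for _ in range(n)] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                for k in range(n):
                    for wa, ca in P[i][k].items():
                        for wb, cb in A[k][j].items():
                            Q[i][j][wa + wb] = Q[i][j].get(wa + wb, 0) + ca * cb
        P = Q
    return P

def poly_sum(polys):
    """Sum of polynomials stored as {exponent: coefficient}."""
    out = {}
    for p in polys:
        for w, c in p.items():
            out[w] = out.get(w, 0) + c
    return out

def trace(P):
    """Sum of the diagonal entries: tail-biting weight generating function."""
    return poly_sum(P[i][i] for i in range(len(P)))

def total(P):
    """Sum of all entries: weight generating function of the projection."""
    return poly_sum(p for row in P for p in row)

A1 = [[{0: 1}, {2: 1}], [{2: 1}, {2: 1}]]   # HWAM of the C_1 encoder
A2 = [[{0: 1}, {1: 1}], [{3: 1}, {2: 1}]]   # HWAM of the C_2 encoder

for N in range(1, 9):
    P1, P2 = mat_pow(A1, N), mat_pow(A2, N)
    print(N,
          "subcode equal:", P1[0][0] == P2[0][0],
          "tail-biting equal:", trace(P1) == trace(P2),
          "projection equal:", total(P1) == total(P2))
```

The subcode and tail-biting comparisons depend only on closed walks in the state diagram, which use the $0 \to 1$ and $1 \to 0$ transitions equally often; the projection comparison does not have this property.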
V. Conclusion

In summary, similarly to [1], but using tail-biting terminated
codes, we have shown that there is a MacWilliams identity
between the generating functions of the weight distributions
per unit time of a linear convolutional code C and its
orthogonal code $C^\perp$, and that this distribution is as useful as
the free distance spectrum for estimating code performance.
These results effectively resolve the puzzle posed by Shearer
and McEliece [8].
¹ Another way of reaching the same conclusion is to observe that $C_1$ and
$C_2$ are equivalent under a simple time-invariant, finite-memory permutation.
Therefore $C_1^\perp$ and $C_2^\perp$ are actually equivalent under the same permutation,
and thus must have precisely the same performance.



